我们通过将其基于实现功能空间而不是参数空间的几何形状来系统地研究深度神经网络景观的方法。将分类器分组到等效类中,我们开发了一个标准化的参数化,其中所有对称性都被删除,从而导致环形拓扑。在这个空间上,我们探讨了误差景观而不是损失。这使我们能够得出有意义的概念,即最小化器的平坦度和连接它们的地球通道的概念。使用不同的优化算法,这些算法采样具有不同平坦度的最小化器,我们研究模式连接性和相对距离。测试各种最先进的体系结构和基准数据集,我们确认了平面度和泛化性能之间的相关性;我们进一步表明,在功能空间中,minima彼此更近,并且连接它们的大地测量学的屏障很小。我们还发现,通过梯度下降的变体发现的最小化器可以通过由参数空间中的两个直线组成的零误差路径连接,即带有单个弯曲的多边形链。我们观察到具有二进制权重和激活的神经网络中相似的定性结果,这为在这种情况下的连通性提供了第一个结果之一。我们的结果取决于对称性的去除,并且与对简单浅层模型进行的一些分析研究所描述的丰富现象学非常吻合。
translated by 谷歌翻译
Process monitoring and control are essential in modern industries for ensuring high quality standards and optimizing production performance. These technologies have a long history of application in production and have had numerous positive impacts, but also hold great potential when integrated with Industry 4.0 and advanced machine learning, particularly deep learning, solutions. However, in order to implement these solutions in production and enable widespread adoption, the scalability and transferability of deep learning methods have become a focus of research. While transfer learning has proven successful in many cases, particularly with computer vision and homogenous data inputs, it can be challenging to apply to heterogeneous data. Motivated by the need to transfer and standardize established processes to different, non-identical environments and by the challenge of adapting to heterogeneous data representations, this work introduces the Domain Adaptation Neural Network with Cyclic Supervision (DBACS) approach. DBACS addresses the issue of model generalization through domain adaptation, specifically for heterogeneous data, and enables the transfer and scalability of deep learning-based statistical control methods in a general manner. Additionally, the cyclic interactions between the different parts of the model enable DBACS to not only adapt to the domains, but also match them. To the best of our knowledge, DBACS is the first deep learning approach to combine adaptation and matching for heterogeneous data settings. For comparison, this work also includes subspace alignment and a multi-view learning that deals with heterogeneous representations by mapping data into correlated latent feature spaces. Finally, DBACS with its ability to adapt and match, is applied to a virtual metrology use case for an etching process run on different machine types in semiconductor manufacturing.
translated by 谷歌翻译
An Anomaly Detection (AD) System for Self-diagnosis has been developed for Multiphase Flow Meter (MPFM). The system relies on machine learning algorithms for time series forecasting, historical data have been used to train a model and to predict the behavior of a sensor and, thus, to detect anomalies.
translated by 谷歌翻译
Building a quantum analog of classical deep neural networks represents a fundamental challenge in quantum computing. A key issue is how to address the inherent non-linearity of classical deep learning, a problem in the quantum domain due to the fact that the composition of an arbitrary number of quantum gates, consisting of a series of sequential unitary transformations, is intrinsically linear. This problem has been variously approached in the literature, principally via the introduction of measurements between layers of unitary transformations. In this paper, we introduce the Quantum Path Kernel, a formulation of quantum machine learning capable of replicating those aspects of deep machine learning typically associated with superior generalization performance in the classical domain, specifically, hierarchical feature learning. Our approach generalizes the notion of Quantum Neural Tangent Kernel, which has been used to study the dynamics of classical and quantum machine learning models. The Quantum Path Kernel exploits the parameter trajectory, i.e. the curve delineated by model parameters as they evolve during training, enabling the representation of differential layer-wise convergence behaviors, or the formation of hierarchical parametric dependencies, in terms of their manifestation in the gradient space of the predictor function. We evaluate our approach with respect to variants of the classification of Gaussian XOR mixtures - an artificial but emblematic problem that intrinsically requires multilevel learning in order to achieve optimal class separation.
translated by 谷歌翻译
Aliasing is a highly important concept in signal processing, as careful consideration of resolution changes is essential in ensuring transmission and processing quality of audio, image, and video. Despite this, up until recently aliasing has received very little consideration in Deep Learning, with all common architectures carelessly sub-sampling without considering aliasing effects. In this work, we investigate the hypothesis that the existence of adversarial perturbations is due in part to aliasing in neural networks. Our ultimate goal is to increase robustness against adversarial attacks using explainable, non-trained, structural changes only, derived from aliasing first principles. Our contributions are the following. First, we establish a sufficient condition for no aliasing for general image transformations. Next, we study sources of aliasing in common neural network layers, and derive simple modifications from first principles to eliminate or reduce it. Lastly, our experimental results show a solid link between anti-aliasing and adversarial attacks. Simply reducing aliasing already results in more robust classifiers, and combining anti-aliasing with robust training out-performs solo robust training on $L_2$ attacks with none or minimal losses in performance on $L_{\infty}$ attacks.
translated by 谷歌翻译
The problem of generating an optimal coalition structure for a given coalition game of rational agents is to find a partition that maximizes their social welfare and is known to be NP-hard. This paper proposes GCS-Q, a novel quantum-supported solution for Induced Subgraph Games (ISGs) in coalition structure generation. GCS-Q starts by considering the grand coalition as initial coalition structure and proceeds by iteratively splitting the coalitions into two nonempty subsets to obtain a coalition structure with a higher coalition value. In particular, given an $n$-agent ISG, the GCS-Q solves the optimal split problem $\mathcal{O} (n)$ times using a quantum annealing device, exploring $\mathcal{O}(2^n)$ partitions at each step. We show that GCS-Q outperforms the currently best classical solvers with its runtime in the order of $n^2$ and an expected worst-case approximation ratio of $93\%$ on standard benchmark datasets.
translated by 谷歌翻译
Anomaly Detection is a relevant problem that arises in numerous real-world applications, especially when dealing with images. However, there has been little research for this task in the Continual Learning setting. In this work, we introduce a novel approach called SCALE (SCALing is Enough) to perform Compressed Replay in a framework for Anomaly Detection in Continual Learning setting. The proposed technique scales and compresses the original images using a Super Resolution model which, to the best of our knowledge, is studied for the first time in the Continual Learning setting. SCALE can achieve a high level of compression while maintaining a high level of image reconstruction quality. In conjunction with other Anomaly Detection approaches, it can achieve optimal results. To validate the proposed approach, we use a real-world dataset of images with pixel-based anomalies, with the scope to provide a reliable benchmark for Anomaly Detection in the context of Continual Learning, serving as a foundation for further advancements in the field.
translated by 谷歌翻译
Digital media have enabled the access to unprecedented literary knowledge. Authors, readers, and scholars are now able to discover and share an increasing amount of information about books and their authors. Notwithstanding, digital archives are still unbalanced: writers from non-Western countries are less represented, and such a condition leads to the perpetration of old forms of discrimination. In this paper, we present the Under-Represented Writers Knowledge Graph (URW-KG), a resource designed to explore and possibly amend this lack of representation by gathering and mapping information about works and authors from Wikidata and three other sources: Open Library, Goodreads, and Google Books. The experiments based on KG embeddings showed that the integrated information encoded in the graph allows scholars and users to be more easily exposed to non-Western literary works and authors with respect to Wikidata alone. This opens to the development of fairer and effective tools for author discovery and exploration.
translated by 谷歌翻译
Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively few research has been done in this area. This paper presents the findings of the ``IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim to foster and facilitate research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that lead to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
translated by 谷歌翻译
Deep Reinforcement Learning is emerging as a promising approach for the continuous control task of robotic arm movement. However, the challenges of learning robust and versatile control capabilities are still far from being resolved for real-world applications, mainly because of two common issues of this learning paradigm: the exploration strategy and the slow learning speed, sometimes known as "the curse of dimensionality". This work aims at exploring and assessing the advantages of the application of Quantum Computing to one of the state-of-art Reinforcement Learning techniques for continuous control - namely Soft Actor-Critic. Specifically, the performance of a Variational Quantum Soft Actor-Critic on the movement of a virtual robotic arm has been investigated by means of digital simulations of quantum circuits. A quantum advantage over the classical algorithm has been found in terms of a significant decrease in the amount of required parameters for satisfactory model training, paving the way for further promising developments.
translated by 谷歌翻译